[V3] convert nodes_controlnet.py to V3 schema #10202
Conversation
+label: Core

To avoid code duplication, we should probably extract the contents of the controlnet nodes into their own functions, which all the controlnet nodes can then reference. That should not break any custom nodes that import these nodes from nodes.py either.
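A minimal sketch of that extraction idea, assuming the shared helper mirrors what `ControlNetApplyAdvanced.apply_controlnet` currently does in nodes.py; the helper name `apply_controlnet_to_conds` and the trimmed argument list are illustrative, not the actual ComfyUI API:

```python
# Hypothetical sketch: the controlnet logic lives in one module-level helper,
# and the existing V1 class in nodes.py forwards to it, so custom nodes that
# import ControlNetApplyAdvanced from nodes.py keep working.
def apply_controlnet_to_conds(positive, negative, control_net, image,
                              strength, start_percent, end_percent, vae=None):
    if strength == 0:
        return positive, negative
    control_hint = image.movedim(-1, 1)  # hint image as BCHW
    cnets = {}  # reuse one controlnet copy per distinct previous controlnet
    out = []
    for conditioning in (positive, negative):
        new_cond = []
        for embedding, opts in conditioning:
            opts = opts.copy()
            prev = opts.get("control", None)
            if prev not in cnets:
                c_net = control_net.copy().set_cond_hint(
                    control_hint, strength, (start_percent, end_percent), vae=vae)
                c_net.set_previous_controlnet(prev)
                cnets[prev] = c_net
            opts["control"] = cnets[prev]
            opts["control_apply_to_uncond"] = False
            new_cond.append([embedding, opts])
        out.append(new_cond)
    return out[0], out[1]


class ControlNetApplyAdvanced:  # stays in nodes.py; body becomes a thin wrapper
    def apply_controlnet(self, positive, negative, control_net, image,
                         strength, start_percent, end_percent, vae=None):
        return apply_controlnet_to_conds(positive, negative, control_net, image,
                                         strength, start_percent, end_percent, vae=vae)
```

Each V3 controlnet node could then call the same helper from its own execute method instead of duplicating the loop.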
Force-pushed from f3bd8b5 to d0b4966.
I have force-pushed the new version without duplicating or extracting code, where we simply call the old node without inheritance. What do you think of the new version?
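A minimal sketch of what "call the old node without inheritance" can look like, assuming the V3 io interface used by other converted files (`io.ComfyNode`, `io.NodeOutput`); the exact names and signatures are assumptions, not a copy of this PR's code:

```python
# Hypothetical sketch: the V3-schema node does not subclass the V1 node; its
# execute() instantiates the legacy class from nodes.py and reuses its method.
import nodes
from comfy_api.latest import io  # assumed V3 import path


class ControlNetApplyAdvanced(io.ComfyNode):
    @classmethod
    def execute(cls, positive, negative, control_net, image,
                strength, start_percent, end_percent, vae=None) -> io.NodeOutput:
        positive, negative = nodes.ControlNetApplyAdvanced().apply_controlnet(
            positive, negative, control_net, image,
            strength, start_percent, end_percent, vae=vae)
        return io.NodeOutput(positive, negative)
```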
It looks OK. The nodes inside nodes.py will be the last to be converted (and at an unknown time), so this should do for now. In the long term we'd probably want to extract the controlnet function and make it generalizable to any number of conditionings. I'll start sending out notices to node authors for nodes_controlnet.py.
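A sketch of that longer-term generalization, assuming the same conditioning-list structure as in the earlier helper; `apply_controlnet_n` and its parameters are illustrative names, not existing ComfyUI API:

```python
# Hypothetical sketch: one helper that attaches a controlnet to any number of
# conditioning lists; a two-input node would pass [positive, negative], while
# other controlnet nodes could pass more (or fewer) lists.
def apply_controlnet_n(conditionings, control_net, image, strength,
                       start_percent, end_percent, vae=None):
    if strength == 0:
        return [list(c) for c in conditionings]
    control_hint = image.movedim(-1, 1)
    cnets = {}  # shared across all lists so chained controlnets are copied once

    def attach(opts):
        opts = opts.copy()
        prev = opts.get("control", None)
        if prev not in cnets:
            c_net = control_net.copy().set_cond_hint(
                control_hint, strength, (start_percent, end_percent), vae=vae)
            c_net.set_previous_controlnet(prev)
            cnets[prev] = c_net
        opts["control"] = cnets[prev]
        opts["control_apply_to_uncond"] = False
        return opts

    return [[[emb, attach(opts)] for emb, opts in cond] for cond in conditionings]
```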
Force-pushed from d0b4966 to d3e43bc.
Added.
Node authors notified, will merge on Friday October 17th. |
The time has come.
[Auto-generated rebase commit log: upstream master commits comfyanonymous#9794 through comfyanonymous#10527, plus co-author trailers.]
Node `ControlNetInpaintingAliMamaApply` was tested after conversion:

Link to the sources of the `ControlNetApplyAdvanced` node that was copy-pasted into the `ControlNetInpaintingAliMamaApply` node: ComfyUI/nodes.py, lines 874 to 900 in bbd6830
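For context on what "convert to V3 schema" means for a node like this, a rough sketch of the schema side is below; the `io.*` classes and parameters are recalled from comfy_api's V3 interface as used by other converted files, and the exact names, defaults, and option names are assumptions rather than a copy of the merged nodes_controlnet.py:

```python
# Rough, assumed sketch of a V3 schema definition for a controlnet node.
from comfy_api.latest import io


class ControlNetApplyAdvanced(io.ComfyNode):
    @classmethod
    def define_schema(cls) -> io.Schema:
        return io.Schema(
            node_id="ControlNetApplyAdvanced",
            category="conditioning/controlnet",
            inputs=[
                io.Conditioning.Input("positive"),
                io.Conditioning.Input("negative"),
                io.ControlNet.Input("control_net"),
                io.Image.Input("image"),
                io.Float.Input("strength", default=1.0, min=0.0, max=10.0, step=0.01),
                io.Float.Input("start_percent", default=0.0, min=0.0, max=1.0, step=0.001),
                io.Float.Input("end_percent", default=1.0, min=0.0, max=1.0, step=0.001),
                io.Vae.Input("vae", optional=True),
            ],
            outputs=[
                io.Conditioning.Output(display_name="positive"),
                io.Conditioning.Output(display_name="negative"),
            ],
        )
    # execute() would delegate to the legacy node in nodes.py, as discussed in
    # the conversation above.
```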
Objects git diff: